This folder contains results from the PCA merging algorithm applied to the CNV profiles of 5612 CoLaus (Cohorte Lausanne) individuals.
The merge aims to identify genomic regions where the CNV profile is similar across individuals.
These results therefore summarize copy number variation within the Swiss population, as sampled by the CoLaus cohort.

Results are provided in the hg18 (build36) genome assembly.


Four distinct methods were used to predict the CNV profiles:
- CNAT.allelic
- CNAT.total
- Circular Binary Segmentation (CBS)
- Gaussian Mixture Modelling (GMM)



Results are available as:

1- Flat files (.tab files)

For example:

chr	start	end	#SNPs	cnvFreq	dupFreq	delFreq
1	742429	789326	6	5.616	5.403	0.213
1	993492	1500664	13	0.569	0.533	0.036
The first 3 columns define the region coordinates (build hg18).
#SNPs is the number of Affymetrix 500K SNPs that make up the region.
The last 3 columns give the CNV, duplication and deletion frequencies of the region,
i.e. the percentage of CoLaus individuals (out of 5612) in whom the region is:
- either deleted or duplicated: cnvFreq
- duplicated only: dupFreq
- deleted only: delFreq

Files are tab-delimited and can easily be loaded into spreadsheet software (Excel, OpenOffice), R and many other tools.
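
As an illustration of reading such a file programmatically, here is a minimal Python sketch using only the standard library. The two data rows are the example rows shown above; the function name read_regions and the key n_snps are hypothetical names chosen for this example. A real .tab file would be passed in as an open file handle instead of the in-memory sample.

```python
import csv
import io

# The two example rows from this README. For a real file, use open(path)
# in place of io.StringIO(SAMPLE).
SAMPLE = (
    "chr\tstart\tend\t#SNPs\tcnvFreq\tdupFreq\tdelFreq\n"
    "1\t742429\t789326\t6\t5.616\t5.403\t0.213\n"
    "1\t993492\t1500664\t13\t0.569\t0.533\t0.036\n"
)


def read_regions(handle):
    """Parse a tab-delimited region file into a list of dicts with typed values."""
    regions = []
    for row in csv.DictReader(handle, delimiter="\t"):
        regions.append({
            "chr": row["chr"],               # kept as text (chromosome label)
            "start": int(row["start"]),      # hg18 coordinates
            "end": int(row["end"]),
            "n_snps": int(row["#SNPs"]),     # SNPs in the region
            "cnvFreq": float(row["cnvFreq"]),  # frequencies are percentages
            "dupFreq": float(row["dupFreq"]),
            "delFreq": float(row["delFreq"]),
        })
    return regions


regions = read_regions(io.StringIO(SAMPLE))
for r in regions:
    # Print each region with its length in base pairs and its CNV frequency.
    print(r["chr"], r["start"], r["end"], r["end"] - r["start"], r["cnvFreq"])
```

In the two example rows, cnvFreq equals dupFreq plus delFreq; whether this holds for every row of every file is not stated here and should be checked on the data.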

2- BED file

BED files are useful for visualization in the UCSC genome browser.
Files can be uploaded from http://genome.ucsc.edu/cgi-bin/hgGateway ("add track" button).
Please ensure that the selected assembly is Mar. 2006 (NCBI36/hg18).

Once a file is uploaded, three tracks can be browsed:
- a CNV track in gray
- a duplication track in blue
- a deletion track in red
The name of the corresponding method is displayed in the track name.
Each bar corresponds to one PCA-merged region, and its height indicates the percentage of CoLaus individuals carrying a CNV, a duplication or a deletion in that region, depending on the track
(see the flat file section for the frequency definitions).
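
For inspecting a BED file outside the browser, the following Python sketch groups the features of a multi-track file by track. It assumes that each track starts with a "track" header line, as in the UCSC custom track convention, and that features are tab-delimited; the exact layout of the files in this folder is not described here. The sample content, including the track names, is made up for illustration, and split_tracks is a hypothetical helper name.

```python
import io

# Illustrative content only: the track names are invented and the layout
# (one "track" line per track, tab-delimited features) is an assumption.
SAMPLE_BED = (
    'track name="example_cnv"\n'
    "chr1\t742429\t789326\n"
    'track name="example_dup"\n'
    "chr1\t742429\t789326\n"
    "chr1\t993492\t1500664\n"
)


def split_tracks(handle):
    """Group BED features by the 'track' header line that precedes them."""
    tracks = {}
    current = "(no track line)"  # bucket for features before any header
    for line in handle:
        line = line.rstrip("\n")
        # Skip blank lines, comments and UCSC "browser" directives.
        if not line or line.startswith(("#", "browser")):
            continue
        if line.startswith("track"):
            current = line
            tracks.setdefault(current, [])
            continue
        fields = line.split("\t")
        # Keep only the three mandatory BED fields: chrom, start, end.
        feature = (fields[0], int(fields[1]), int(fields[2]))
        tracks.setdefault(current, []).append(feature)
    return tracks


tracks = split_tracks(io.StringIO(SAMPLE_BED))
for header, features in tracks.items():
    print(header, len(features))
```

Any additional BED columns in the real files (for example a score used for the bar height) are ignored by this sketch.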